Biostatistics For Dummies, 2nd Edition (Monika Wahi, John Pezzullo)

CHAPTER 18 A Yes-or-No Proposition: Logistic Regression 265

As shown in Figure 18-7, the ROC curve always starts in the lower-left corner of

the graph, where 0 percent sensitivity intersects with 100 percent specificity. It

ends in the upper-right corner, where 100 percent sensitivity intersects with

0 percent specificity. Most software also draws a diagonal straight line between

the lower-left and upper-right corners because that represents the formula:

sensitivity

specificity

1 –

. If your model’s ROC curve were to match that line, it

would indicate the total absence of any predicting ability at all of your model.

Like Figure 18-7, every ROC graph has sensitivity running up the Y axis, which is

displayed either as fractions between 0 and 1 or as percentages between 0 and 100.

The X axis is either presented from left to right as 1 – specificity, or like it is in

Figure 18-7, where specificity is labeled backwards — from right to left — along

the X axis.

Most ROC curves lie in the upper-left part of the graph area. The farther away

from the diagonal line they are, the better the predictive model is. For a nearly

perfect model, the ROC curve runs up along the Y axis from the lower-left corner

to the upper-left corner, then along the top of the graph from the upper-left cor-

ner to the upper-right corner.

Because of how sensitivity and specificity are calculated, the graph appears as a

series of steps. If you have a large data set, your graph will have more and smaller

steps. For clarity, we show the cut values for predicted probability as a scale along

the ROC curve itself in Figure 18-7, but unfortunately, most statistical software

doesn’t do this for you.

Looking at the ROC curve helps you choose a cut value that gives the best tradeoff

between sensitivity and specificity:»

» ^{To have very few false positives:}^{Choose a higher cut value to give a high}

specificity. Figure 18-7 shows that by setting the cut value to 0.6, you can

simultaneously achieve about 93 percent specificity and 87 percent sensitivity.»

» ^{To have very few false negatives:}^{Choose a lower cut value to give higher}

sensitivity. Figure 18-7 shows you that if you set the cut value to 0.3, you can

have almost perfect sensitivity because you’ll be at almost 100 percent, but

your specificity will be only about 75 percent, meaning you’ll have a 25 percent

false positive rate.

The software may optionally display the area under the ROC curve (abbreviated

AUC), along with its standard error and a p value. This is another measure of how

good the predictive model is. The diagonal line has an AUC of 0.5, and there is a

statistical test comparing your AUC to the diagonal line. Under α = 0.05, if the p

value < 0.05, it indicates that your model is statistically significantly better than

the diagonal line at accurately predicting your outcome.